feat: add crates/transcript with TranscriptAccumulator #4114

Merged
yujonglee merged 2 commits into main from devin/1771554914-crates-transcript
Feb 20, 2026

@devin-ai-integration

feat: add crates/transcript with TranscriptAccumulator

Summary

Extracts the TranscriptAccumulator into a new standalone shared crate, crates/transcript. This is the crate-only portion of #4064, with no plugin or desktop app integration.

The accumulator processes streaming ASR StreamResponses into clean, deduplicated transcript words using a two-level design:

  • Within a response: assemble aligns tokens to the transcript string, which is the sole oracle for word boundaries (no timing heuristics)
  • Across responses: stitch uses a 300 ms timing heuristic to merge words that were split across final responses (e.g. Korean particles)
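
As a rough illustration of the cross-response step, the sketch below merges a held word with the first word of the next final response when the timing gap is small. The `Word` type, its field names, the leading-space check, and the inclusive `<=` comparison are assumptions made for this example and are not taken from the crate; only the 300 ms threshold and the `should_stitch` / `stitch` names come from the description above.

```rust
/// Hypothetical word type for this sketch; the crate's real type may differ.
#[derive(Debug, Clone, PartialEq)]
struct Word {
    /// Text including any leading space, e.g. " hello".
    text: String,
    start_ms: u64,
    end_ms: u64,
}

/// Threshold quoted in the PR description.
const STITCH_GAP_MS: u64 = 300;

/// Assumed rule: stitch when the next word has no leading space and starts
/// within 300 ms of the held word's end. The inclusive comparison is a guess.
fn should_stitch(held: &Word, next: &Word) -> bool {
    let gap = next.start_ms.saturating_sub(held.end_ms);
    !next.text.starts_with(' ') && gap <= STITCH_GAP_MS
}

/// Merge two words into one, keeping the outer timestamps.
fn stitch(held: Word, next: Word) -> Word {
    Word {
        text: format!("{}{}", held.text, next.text),
        start_ms: held.start_ms,
        end_ms: next.end_ms,
    }
}
```

Under these assumptions, a Korean particle that arrives 50 ms after its noun in a separate final response is merged back into the noun, while the same particle arriving 600 ms later is kept as a separate word.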

Per-channel state handles watermark-based dedup, partial word splicing, and held-word stitching. 37 unit tests pass, including fixture replay tests against real Deepgram and Soniox data (English + Korean).
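
One way to picture watermark-based dedup is the minimal sketch below: each channel remembers the end time of the last word it accepted and drops any incoming final word that does not extend past it. All names here (`Accumulator`, `ChannelState`, `push_final`, `watermark_ms`) are hypothetical, and the real crate may track the watermark differently.

```rust
use std::collections::HashMap;

/// Hypothetical word type for this sketch.
#[derive(Debug, Clone, PartialEq)]
struct Word {
    text: String,
    start_ms: u64,
    end_ms: u64,
}

/// Hypothetical per-channel state: a watermark plus the accepted final words.
#[derive(Default)]
struct ChannelState {
    watermark_ms: u64,
    finals: Vec<Word>,
}

#[derive(Default)]
struct Accumulator {
    channels: HashMap<usize, ChannelState>,
}

impl Accumulator {
    /// Accept final words for one channel and return only the new ones.
    fn push_final(&mut self, channel: usize, words: Vec<Word>) -> Vec<Word> {
        let state = self.channels.entry(channel).or_default();
        let mut fresh = Vec::new();
        for w in words {
            // Assumption: a word ending at or before the watermark was
            // already emitted by an earlier response, so it is skipped.
            if w.end_ms <= state.watermark_ms {
                continue;
            }
            state.watermark_ms = w.end_ms;
            fresh.push(w.clone());
            state.finals.push(w);
        }
        fresh
    }
}
```

Keeping the watermark per channel means a word replayed on channel 0 is dropped, while a word with the same timestamps on channel 1 is still accepted.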

Nothing depends on this crate yet — integration into plugin-listener/plugin-listener2 is a follow-up.

Review & Testing Checklist for Human

  • Verify that the greedy-forward find in spacing_from_transcript handles repeated substrings in transcripts correctly (e.g. "the the"); a mismatch could silently assign the wrong spacing
  • Review the 300ms threshold in should_stitch — this is the heuristic for merging words split across responses; confirm it works for your provider mix
  • Confirm splice boundary conditions (<= / >=) don't drop or duplicate words at exact timestamp boundaries
  • Run cargo test -p transcript locally to confirm fixture tests pass against current hypr-data
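
For reviewers checking the first item, a greedy-forward find can be sketched as follows. This is an assumption-laden illustration, not the crate's code: the signature, the `Option<Vec<bool>>` return type, and the choice to return `None` on a miss are all hypothetical. The key property is that the search cursor advances past each match, so a repeated token is located at its next occurrence instead of the first one again.

```rust
/// For each token, report whether it is preceded by whitespace in `transcript`.
/// Returns `None` if a token cannot be found at or after the cursor.
/// Hypothetical signature; the crate's function may differ.
fn spacing_from_transcript(tokens: &[&str], transcript: &str) -> Option<Vec<bool>> {
    // Byte offset just past the previous match.
    let mut cursor = 0;
    let mut spacing = Vec::with_capacity(tokens.len());
    for &token in tokens {
        // Search only the unconsumed tail of the transcript.
        let rel = transcript[cursor..].find(token)?;
        let start = cursor + rel;
        // A token has leading space if the character before it is whitespace.
        let preceded = transcript[..start]
            .chars()
            .next_back()
            .map_or(false, char::is_whitespace);
        spacing.push(preceded);
        // Advance past this match so repeated tokens match later occurrences.
        cursor = start + token.len();
    }
    Some(spacing)
}
```

With tokens `["the", "the"]` and transcript `"the the"`, the second token is found at byte 4 and marked as space-preceded. The failure mode worth testing is a token that also occurs as a substring of an earlier, skipped region or of a longer neighboring word, since the greedy search takes the first hit after the cursor without checking word boundaries.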

Notes

devin-ai-integration bot and others added 2 commits February 20, 2026 02:37
Co-Authored-By: yujonglee <yujonglee.dev@gmail.com>
Co-Authored-By: yujonglee <yujonglee.dev@gmail.com>
@devin-ai-integration

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR that start with 'DevinAI' or '@devin'.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

@netlify

netlify bot commented Feb 20, 2026

Deploy Preview for hyprnote-storybook canceled.

Name Link
🔨 Latest commit e7eb1ce
🔍 Latest deploy log https://app.netlify.com/projects/hyprnote-storybook/deploys/6997c9369d5e4d0008d7ed21

@netlify

netlify bot commented Feb 20, 2026

Deploy Preview for hyprnote canceled.

Name Link
🔨 Latest commit e7eb1ce
🔍 Latest deploy log https://app.netlify.com/projects/hyprnote/deploys/6997c93673135a0008b34c8d

@yujonglee yujonglee merged commit b346232 into main Feb 20, 2026
14 of 15 checks passed
@yujonglee yujonglee deleted the devin/1771554914-crates-transcript branch February 20, 2026 02:49